CRUDAW: A Novel Fuzzy Technique for Clustering Records Following User Defined Attribute Weights

Authors

  • Md Anisur Rahman
  • Md Zahidul Islam
Abstract

We present a novel fuzzy clustering technique called CRUDAW that allows a data miner to assign weights to the attributes of a data set according to their importance (to the data miner) for clustering. The technique uses a novel approach to select initial seeds deterministically (not randomly), using the density of the records of a data set. CRUDAW also selects the initial fuzzy membership degrees deterministically. Moreover, it uses a novel approach for measuring distance that takes the user-defined attribute weights into account. When measuring the distance between two values of a categorical attribute, the technique takes the similarity of the values into consideration instead of treating the distance as either 0 or 1. The complete algorithm for CRUDAW is presented in the paper. We experimentally compare our technique with a few existing techniques, namely SABC, GFCM, and KL-FCM-GM, using several evaluation criteria: Silhouette coefficient, F-measure, purity, and entropy. We also use a t-test, a confidence interval test, and time complexity to evaluate the performance of our technique. Four data sets from the UCI Machine Learning Repository are used in the experiments. Our experimental results indicate that CRUDAW performs significantly better than the existing techniques in producing high-quality clusters.
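The abstract does not give CRUDAW's formulas, so the following is only a minimal sketch of the general idea: a distance over mixed attributes that honours user-defined weights and uses a graded similarity for categorical values instead of a 0/1 mismatch. The function names, the range normalization, the `similarity` lookup table, and the example values are all assumptions made for illustration. The membership update is the standard fuzzy c-means formula, not necessarily the one used in the paper.

```python
from typing import Dict, List, Sequence, Tuple, Union

Value = Union[float, str]


def weighted_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    weights: Sequence[float],
    ranges: Dict[int, float],
    similarity: Dict[Tuple[int, str, str], float],
) -> float:
    """Weighted distance in [0, 1] between two mixed-type records.

    Hypothetical helper, not the paper's definition:
    - numerical attribute j: |x - y| divided by ranges[j]
    - categorical attribute j: 1 - similarity of the two values,
      looked up under the key (j, smaller_value, larger_value);
      unknown pairs fall back to similarity 0 (distance 1).
    """
    total = 0.0
    for j, (x, y) in enumerate(zip(a, b)):
        if isinstance(x, str):
            if x == y:
                d = 0.0
            else:
                key = (j,) + tuple(sorted((x, str(y))))
                d = 1.0 - similarity.get(key, 0.0)
        else:
            span = ranges.get(j, 0.0)
            d = abs(x - float(y)) / span if span else 0.0
        total += weights[j] * d
    return total / sum(weights)


def fcm_memberships(distances: Sequence[float], m: float = 2.0) -> List[float]:
    """Standard fuzzy c-means membership degrees of one record,
    given its distances to every cluster centre (fuzzifier m > 1)."""
    zeros = [i for i, d in enumerate(distances) if d == 0.0]
    if zeros:
        # The record coincides with one or more centres: share full membership.
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(distances))]
    inv = [d ** (-2.0 / (m - 1.0)) for d in distances]
    s = sum(inv)
    return [v / s for v in inv]


# Illustrative records: (age, occupation); occupation is weighted 3x age.
rec_a = (30.0, "nurse")
rec_b = (40.0, "doctor")
dist = weighted_distance(
    rec_a,
    rec_b,
    weights=[1.0, 3.0],
    ranges={0: 50.0},                          # assumed age range
    similarity={(1, "doctor", "nurse"): 0.6},  # assumed value similarity
)
print(round(dist, 4))
print(fcm_memberships([1.0, 2.0]))
```

With these made-up inputs the age term contributes 1 × 0.2 and the occupation term 3 × 0.4, so raising the occupation weight pulls the overall distance toward the categorical dissimilarity. That is the effect user-defined weights are meant to have.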


Similar resources

A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset

Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to the weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that algorithms able to effectively handle the nonstationarity and uncertainty of the signals should be used for ECG analysis. Ex...


Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...


Multiple attribute decision making with triangular intuitionistic fuzzy numbers based on zero-sum game approach

For many decision problems with uncertainty, triangular intuitionistic fuzzy number is a useful tool in expressing ill-known quantities. This paper develops a novel decision method based on zero-sum game for multiple attribute decision making problems where the attribute values take the form of triangular intuitionistic fuzzy numbers and the attribute weights are unknown. First, a new value ind...


Soft Computing-based New Interval-valued Pythagorean Triangular Fuzzy Multi-criteria Group Assessment Method without Aggregation: Application to a Transport Projects Appraisal

In this paper, an interval-valued Pythagorean triangular fuzzy number (IVPTFN) as a useful tool to handle decision-making problems with vague quantities is defined. Then, their operational laws are developed. By introducing a novel method of making a decision on the concept of possibility theory, a multi-attribute group decision-making (MAGDM) problem is considered, in which the attribute value...


A novel ranking method for intuitionistic fuzzy set based on information fusion and application to threat assessment

A novel ranking method based on multi-time information fusion is proposed for intuitionistic fuzzy sets (IFSs) and applied to the threat assessment problem, a multi-attribute decision making (MADM) one. This method integrates a designed intuitionistic fuzzy entropy (IFE), the closeness degree of technique for order preference by similarity to ideal solution (TOPSIS), the decision maker's (DM'...




Publication date: 2012